Abstract

Noether's theorem links the symmetries of a quantum system with its conserved quantities, 
and is a cornerstone of quantum mechanics. Here we prove a version of Noether's theorem for 
Markov processes. In quantum mechanics, an observable commutes with the Hamiltonian if 
and only if its expected value remains constant in time for every state. For Markov processes 
that no longer holds, but an observable commutes with the Hamiltonian if and only if both its 
expected value and standard deviation are constant in time for every state. 

1 Introduction 

There is a rich analogy between quantum mechanics and what one might call 'stochastic mechanics', 
where probabilities take the place of amplitudes [1]. In quantum mechanics, we specify the state 
of a system by an element $\psi$ of a Hilbert space, and describe its time evolution by the Schrödinger
equation:

$$\frac{d}{dt}\psi(t) = -iH\psi(t)$$

where $H$ is a self-adjoint linear operator called the Hamiltonian. For Markov processes, we specify
the state of a system by a probability distribution $\psi$ on some measure space, and describe its time
evolution by the so-called 'master equation'

$$\frac{d}{dt}\psi(t) = H\psi(t)$$

where $H$ is a linear operator variously known as a 'stochastic Hamiltonian', 'transition rate matrix'
or 'intensity matrix'. In quantum mechanics, it is well-known that conserved quantities correspond 
to self-adjoint operators that commute with the Hamiltonian. Here we present a similar result for 
Markov processes. 

To avoid technicalities and focus on the basic idea, we start by considering Markov processes, or 
technically 'Markov semigroups', where the measure space X is just a finite set equipped with its 
counting measure. Later we consider the general case. We begin by reviewing some basic facts and 
setting up some definitions; for details see Stroock [4]. 



When $X$ is a finite set, a probability distribution on $X$ is a function $\psi\colon X \to \mathbb{R}$ such that
$\psi_i \ge 0$ for all $i \in X$ and

$$\sum_{i \in X} \psi_i = 1.$$

We say an operator $U\colon \mathbb{R}^X \to \mathbb{R}^X$ is stochastic if it is linear and it maps probability distributions
to probability distributions. A Markov semigroup consists of operators $U(t)\colon \mathbb{R}^X \to \mathbb{R}^X$, one for
each $t \in [0, \infty)$, such that:

(i) $U(t)$ is stochastic for all $t \ge 0$.

(ii) $U(t)$ depends continuously on $t$.

(iii) $U(s + t) = U(s)U(t)$ for all $s, t \ge 0$.

(iv) $U(0) = I$.

Any Markov semigroup may be written as $U(t) = \exp(tH)$ for a unique linear operator
$H\colon \mathbb{R}^X \to \mathbb{R}^X$. Moreover, this operator $H$ is infinitesimal stochastic, meaning that if we write it as a matrix
using the canonical basis for $\mathbb{R}^X$, then:

(i) $H_{ij} \ge 0$ for all $i, j \in X$ with $i \ne j$.

(ii) $\sum_{i \in X} H_{ij} = 0$ for all $j \in X$.

For $i \ne j$ the matrix entry $H_{ij}$ is the probability per time of a transition from the state $j \in X$ to
the state $i \in X$. Condition (i) says that these probabilities are nonnegative. Condition (ii) then
says that the diagonal entry $H_{ii}$ is minus the probability per time of a transition out of the state $i$.
Conversely, for any infinitesimal stochastic operator $H\colon \mathbb{R}^X \to \mathbb{R}^X$, $\exp(tH)$ is a Markov semigroup.
Given any function $\psi\colon X \to \mathbb{R}$, we obtain a solution of the master equation:

$$\frac{d}{dt}\psi(t) = H\psi(t)$$

with $H$ as Hamiltonian and $\psi$ as the initial value by setting

$$\psi(t) = \exp(tH)\psi.$$

If $\psi$ is a probability distribution, then so is $\psi(t)$ for all $t \ge 0$.
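The correspondence between infinitesimal stochastic operators and Markov semigroups can be checked numerically on a small example. The following is a minimal sketch in plain Python, not part of the original argument: the 3-state matrix `H`, the helper names, and the truncated Taylor series used for the exponential are all illustrative assumptions.

```python
# Illustrative sketch: a hypothetical 3-state rate matrix and helper names.

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=60):
    """Truncated Taylor series sum_k A^k / k!, adequate for small matrices."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def is_infinitesimal_stochastic(H, tol=1e-12):
    """Off-diagonal entries nonnegative and every column summing to zero."""
    n = len(H)
    off_diag = all(H[i][j] >= 0 for i in range(n) for j in range(n) if i != j)
    columns = all(abs(sum(H[i][j] for i in range(n))) < tol for j in range(n))
    return off_diag and columns

def is_stochastic(U, tol=1e-9):
    """Entries nonnegative and every column summing to one."""
    n = len(U)
    nonneg = all(U[i][j] >= -tol for i in range(n) for j in range(n))
    columns = all(abs(sum(U[i][j] for i in range(n)) - 1.0) < tol for j in range(n))
    return nonneg and columns

def scaled(H, t):
    return [[t * x for x in row] for row in H]

# H[i][j] is the rate of transitions from state j to state i.
H = [[-1.0, 2.0, 0.0],
     [1.0, -3.0, 0.5],
     [0.0, 1.0, -0.5]]

U1 = mat_exp(scaled(H, 1.0))
U2 = mat_exp(scaled(H, 2.0))
U1_squared = mat_mul(U1, U1)
# Semigroup law U(1)U(1) = U(2), up to rounding error.
semigroup_gap = max(abs(U1_squared[i][j] - U2[i][j])
                    for i in range(3) for j in range(3))
```

Here `is_stochastic(U1)` tests the matrix form of the definition: a linear map sends probability distributions to probability distributions exactly when its entries are nonnegative and its columns sum to one.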

Next we turn to Noether's theorem. There are many theorems of this general type, all of which 
relate symmetries of a physical system to its conserved quantities. Noether's original theorem applies 
to the Lagrangian approach to classical mechanics, and obtains conserved quantities from symmetries 
of the Lagrangian [2, 3]. In the Hamiltonian approach to classical mechanics, any observable having 
vanishing Poisson brackets with the Hamiltonian both generates symmetries of the Hamiltonian and 
is a conserved quantity. This idea extends to quantum mechanics if we replace Poisson brackets 
by commutators. It is this last form of Noether's theorem, somewhat removed from the original 
form but very easy to prove, that we now generalize to Markov processes. For a Markov process, an 
observable will commute with the Hamiltonian if and only if both its expected value and that of its 
square are constant in time for every state. 

Here an observable is a function $O\colon X \to \mathbb{R}$ assigning a real number $O_i$ to each state $i \in X$.
We identify $O$ with the diagonal matrix with $ii$th entry equal to $O_i$, and define its expected value
for a probability distribution $\psi$ to be

$$\langle O, \psi\rangle = \sum_{i \in X} O_i \psi_i.$$

Our Noether theorem for Markov processes may then be stated as follows:




Proposition (Noether's Theorem, Stochastic Version). Let $X$ be a finite set, let $H\colon \mathbb{R}^X \to \mathbb{R}^X$ be
an infinitesimal stochastic operator, and let $O$ be an observable. Then $[O,H] = 0$ if and only if for
all probability distributions $\psi(t)$ obeying the master equation $\frac{d}{dt}\psi(t) = H\psi(t)$, the expected values
$\langle O, \psi(t)\rangle$ and $\langle O^2, \psi(t)\rangle$ are constant.

For comparison, in the quantum version, both the Hamiltonian and the observable are given 
by self-adjoint operators on a Hilbert space. To avoid technicalities, we only state the version for 
bounded operators: 

Proposition (Noether's Theorem, Quantum Version). Let $H$ and $O$ be bounded self-adjoint operators
on a Hilbert space. Then $[O,H] = 0$ if and only if for all states $\psi(t)$ obeying Schrödinger's
equation $\frac{d}{dt}\psi(t) = -iH\psi(t)$ the expected value $\langle \psi(t), O\psi(t)\rangle$ is constant.

The similarity between these two results is striking, but this also illuminates a key difference: the 
Markov version requires not just the expected value of the observable to be constant, but also the 
expected value of its square. This condition cannot be weakened to only require that the expected 
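value be constant. Observe that if

$$H = \begin{pmatrix} 0 & 1 & 0 \\ 0 & -2 & 0 \\ 0 & 1 & 0 \end{pmatrix} \quad\text{and}\quad O = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$$

then for $\psi(0) = (0, 1, 0)$, we have $\frac{d}{dt}\langle O, \psi(t)\rangle = 0$, but $[O, H] \ne 0$.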
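A counterexample of this kind is easy to verify by direct computation. The sketch below takes $H$ with rows $(0,1,0)$, $(0,-2,0)$, $(0,1,0)$ and, as an assumption made for illustration, the observable with values $(0,1,2)$; the function and variable names are hypothetical. It uses the fact that $\frac{d}{dt}\langle O,\psi(t)\rangle = \langle O, H\psi(t)\rangle$, so by linearity it is enough to test the basis distributions.

```python
# Assumed example data: H as described in the lead-in, O with values (0, 1, 2).
H = [[0, 1, 0],
     [0, -2, 0],
     [0, 1, 0]]
O = [0, 1, 2]
n = 3

# [O, H]_{ij} = (O_i - O_j) H_{ij}, because O acts as a diagonal matrix.
commutator = [[(O[i] - O[j]) * H[i][j] for j in range(n)] for i in range(n)]

def rate_of_change(weights, psi):
    """<weights, H psi>: the time derivative of <weights, psi(t)> when psi(t) = psi."""
    H_psi = [sum(H[i][j] * psi[j] for j in range(n)) for i in range(n)]
    return sum(w * x for w, x in zip(weights, H_psi))

# Basis distributions e_j concentrated on a single state j.
basis = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
d_mean = [rate_of_change(O, e) for e in basis]
d_square = [rate_of_change([o * o for o in O], e) for e in basis]

print(commutator)  # [[0, -1, 0], [0, 0, 0], [0, 1, 0]]
print(d_mean)      # [0, 0, 0]
print(d_square)    # [0, 2, 0]
```

With these assumed values the expected value of $O$ is conserved from every initial state, the commutator is nonzero, and it is the expected value of $O^2$ that detects the failure.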

Indeed, in both the quantum and stochastic cases, the time derivative of the expected value of 
an observable O is expressed in terms of the commutator [O, H]. In the quantum case we have 

$$\frac{d}{dt}\langle \psi(t), O\psi(t)\rangle = -i\langle \psi(t), [O,H]\psi(t)\rangle$$

for any solution $\psi(t)$ of Schrödinger's equation. The polarisation identity then implies that the
right-hand side vanishes for all solutions if and only if $[O, H] = 0$. In the stochastic case we have

$$\frac{d}{dt}\langle O, \psi(t)\rangle = \langle 1, [O,H]\psi(t)\rangle$$

for any solution $\psi(t)$ of the master equation. However, in this case the right-hand side can vanish
for all solutions $\psi(t)$ without $[O, H] = 0$, as shown by the above example. To ensure $[O, H] = 0$ we
need a supplementary hypothesis, such as the vanishing of $\frac{d}{dt}\langle O^2, \psi(t)\rangle$.
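The stochastic identity follows in two lines from the definitions. The following is a sketch of that computation for finite $X$, using only the pairing $\langle O, \psi\rangle = \sum_i O_i\psi_i$ and the fact that the columns of $H$ sum to zero; the symbol $\phi$ for an arbitrary vector in $\mathbb{R}^X$ is introduced here only for this derivation.

```latex
\begin{align*}
\frac{d}{dt}\langle O, \psi(t)\rangle
  &= \langle O, H\psi(t)\rangle
   = \langle 1, OH\psi(t)\rangle \\
  &= \langle 1, [O,H]\psi(t)\rangle + \langle 1, HO\psi(t)\rangle
   = \langle 1, [O,H]\psi(t)\rangle,
\end{align*}
since for every $\phi \in \mathbb{R}^X$
\[
\langle 1, H\phi\rangle = \sum_{j \in X}\Big(\sum_{i \in X} H_{ij}\Big)\phi_j = 0 .
\]
```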

What is the meaning of this supplementary hypothesis? Including it means that not only is the
expected value of the observable $O$ conserved, but so is its variance, defined by

$$\langle O^2, \psi(t)\rangle - \langle O, \psi(t)\rangle^2.$$

Of course the variance is the square of the standard deviation of $O$, so an observable commutes with
the Hamiltonian if and only if both its expected value and standard deviation are constant in time
for every state.

2 Proof 

While proving the stochastic version of Noether's theorem it is enlightening to introduce some other
equivalent characterizations of conserved quantities. For this we shall introduce the transition graph
of an infinitesimal stochastic operator. Suppose $X$ is a finite set and $H\colon \mathbb{R}^X \to \mathbb{R}^X$ is an
infinitesimal stochastic operator. We may form a directed graph with the set $X$ as vertices and an
edge from $j$ to $i$ if and only if $H_{ij} \ne 0$. We say $i$ and $j$ are in the same connected component of
this graph if there is a sequence of vertices $j = k_0, k_1, \ldots, k_n = i$ such that for each
$0 \le \ell < n$ there is either an edge from $k_\ell$ to $k_{\ell+1}$ or from $k_{\ell+1}$ to $k_\ell$.
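Since edges may be traversed in either direction, these connected components are those of the underlying undirected graph, and a union-find structure computes them directly. The following is a minimal sketch under that reading; the 4-state matrix `H` and the function names are illustrative assumptions.

```python
def connected_components(H):
    """Components of the transition graph of H, ignoring edge direction."""
    n = len(H)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(n):
            if i != j and H[i][j] != 0:   # an edge from j to i
                parent[find(i)] = find(j)

    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())

def constant_on_components(O, H):
    """True when the observable O takes one value on each component."""
    return all(len({O[v] for v in comp}) == 1
               for comp in connected_components(H))

# Hypothetical rate matrix: states 0 and 1 exchange, state 2 decays to state 3.
H = [[-1, 2, 0, 0],
     [1, -2, 0, 0],
     [0, 0, -3, 0],
     [0, 0, 3, 0]]

components = connected_components(H)
print(components)                                  # [[0, 1], [2, 3]]
print(constant_on_components([5, 5, 7, 7], H))     # True
print(constant_on_components([5, 6, 7, 7], H))     # False
```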

Our Noether theorem is the equivalence of (i) and (iii) in this result: 



Theorem 1. Let $X$ be a finite set, let $H\colon \mathbb{R}^X \to \mathbb{R}^X$ be an infinitesimal stochastic operator, and
let $O$ be an observable. Then the following are equivalent:

(i) $[O,H] = 0$.

(ii) $\frac{d}{dt}\langle f(O), \psi(t)\rangle = 0$ for all polynomials $f\colon \mathbb{R} \to \mathbb{R}$ and all $\psi$ satisfying the master equation with
Hamiltonian $H$.

(iii) $\frac{d}{dt}\langle O, \psi(t)\rangle = \frac{d}{dt}\langle O^2, \psi(t)\rangle = 0$ for all $\psi$ satisfying the master equation with Hamiltonian $H$.

(iv) $O_i = O_j$ if $i$ and $j$ lie in the same connected component of the transition graph of $H$.

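Proof. We prove (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) $\Rightarrow$ (i).

(i) $\Rightarrow$ (ii). As $H$ commutes with $O$, the Taylor expansion of $f$ shows that $H$ commutes with
$f(O)$ whenever $f$ is a polynomial. From this and the master equation we have

$$\frac{d}{dt}\langle f(O), \psi(t)\rangle = \Big\langle f(O), \frac{d}{dt}\psi(t)\Big\rangle = \langle f(O), H\psi(t)\rangle = \langle 1, f(O)H\psi(t)\rangle = \langle 1, Hf(O)\psi(t)\rangle.$$

But $H$ is infinitesimal stochastic, so

$$\langle 1, Hf(O)\psi(t)\rangle = \sum_{i,j \in X} H_{ij} f(O_j)\psi_j(t) = \sum_{j \in X}\Big(\sum_{i \in X} H_{ij}\Big) f(O_j)\psi_j(t) = 0.$$

(ii) $\Rightarrow$ (iii). Both $O$ and $O^2$ are polynomials in $O$.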

(iii) $\Rightarrow$ (iv). Suppose that $i, j \in X$ lie in the same connected component. We claim that then
$O_i = O_j$. Clearly it suffices to show $O_i = O_j$ whenever $H_{ij} \ne 0$. And for this, it is enough to
show that for any $j \in X$ we have

$$\sum_{i \in X} (O_j - O_i)^2 H_{ij} = 0.$$

This is enough, as each term in this sum is nonnegative: when $i = j$ we have $O_j - O_i = 0$, while when
$i \ne j$, both $(O_j - O_i)^2$ and $H_{ij}$ are nonnegative, the latter because $H$ is infinitesimal stochastic.
Thus when their sum is zero each term $(O_j - O_i)^2 H_{ij}$ is zero. But this means that if $H_{ij}$ is nonzero,
then $O_i = O_j$, and this proves the claim.

Expanding the above expression then, we have

$$\sum_{i \in X} (O_j - O_i)^2 H_{ij} = O_j^2 \sum_{i \in X} H_{ij} - 2 O_j \sum_{i \in X} O_i H_{ij} + \sum_{i \in X} O_i^2 H_{ij}.$$

The three terms here are each zero: the first because $H$ is infinitesimal stochastic, and the latter
two since, if $e_j$ is the probability distribution with value 1 at $j \in X$ and 0 elsewhere, then

$$\frac{d}{dt}\langle O, \exp(tH)e_j\rangle\Big|_{t=0} = \langle O, He_j\rangle = \sum_{i \in X} O_i H_{ij}$$

and

$$\frac{d}{dt}\langle O^2, \exp(tH)e_j\rangle\Big|_{t=0} = \langle O^2, He_j\rangle = \sum_{i \in X} O_i^2 H_{ij},$$

and by hypothesis these two derivatives are both zero.

(iv) $\Rightarrow$ (i). When $H_{ij}$ is nonzero, the states $i$ and $j$ lie in the same component, so $O_i = O_j$.
Thus for each $i, j \in X$:

$$[O, H]_{ij} = (OH - HO)_{ij} = O_i H_{ij} - H_{ij} O_j = (O_i - O_j)H_{ij} = 0. \qquad \square$$



3 Generalization 

In this section we generalize Noether's theorem for Markov processes from the case of a finite set of 
states to a more general measure space. This seems to require some new ideas and techniques. 

Suppose that $X$ is a $\sigma$-finite measure space with a measure we write simply as $dx$. Then probability
distributions $\psi$ on $X$ lie in $L^1(X)$. We define an observable $O$ to be any element of the dual
Banach space $L^\infty(X)$, allowing us to define the expected value of $O$ in the probability distribution
$\psi$ to be

$$\langle O, \psi\rangle = \int_X O(x)\psi(x)\,dx.$$

We can also think of an observable $O$ as a bounded operator on $L^1(X)$, namely the operator of
multiplying by the function $O$.

Let us say an operator $U\colon L^1(X) \to L^1(X)$ is stochastic if it is linear, bounded, and maps
probability distributions to probability distributions. Equivalently, $U$ is stochastic if it is linear and
obeys

$$\psi \ge 0 \;\Rightarrow\; U\psi \ge 0$$

and

$$\int_X (U\psi)(x)\,dx = \int_X \psi(x)\,dx$$

for all $\psi \in L^1(X)$. We may also write the latter equation as

$$\langle 1, U\psi\rangle = \langle 1, \psi\rangle.$$

A Markov semigroup is a strongly continuous one-parameter semigroup of stochastic operators
$U(t)\colon L^1(X) \to L^1(X)$. By the Hille-Yosida theorem [5], any Markov semigroup may be written
as $U(t) = \exp(tH)$ for a unique closed operator $H$ on $L^1(X)$. Any operator $H$ that arises this way
is infinitesimal stochastic. However, such operators are typically unbounded and only densely
defined. This makes it difficult to work with the commutator $[O,H]$, because the operator $O$ may
not preserve the domain of $H$. From our experience with quantum mechanics, the solution is to work
instead with the commutators $[O, \exp(tH)]$, which are bounded operators defined on all of $L^1(X)$.
This amounts to working directly with the Markov semigroup instead of the infinitesimal stochastic
operator $H$.

Theorem 2. Suppose $X$ is a $\sigma$-finite measure space and

$$U(t)\colon L^1(X) \to L^1(X)$$

is a Markov semigroup. Suppose $O$ is an observable. Then $[O, U(t)] = 0$ for all $t \ge 0$ if and only if
for all probability distributions $\psi$ on $X$, the expected values $\langle O, U(t)\psi\rangle$ and $\langle O^2, U(t)\psi\rangle$ are constant
as a function of $t$.

This result is an easy consequence of the following one, which is of interest in its own right,
since it amounts to a Noether's theorem for Markov chains. A 'Markov chain' is similar to a Markov
process, but time comes in discrete steps, and at each step the probability distribution $\psi$ evolves via
$\psi \mapsto U\psi$ for some stochastic operator $U$.

Theorem 3. Suppose $X$ is a $\sigma$-finite measure space and $U\colon L^1(X) \to L^1(X)$ is a stochastic operator.
Suppose $O$ is an observable. Then $[O, U] = 0$ if and only if for all probability distributions $\psi$ on $X$,

$$\langle O, U\psi\rangle = \langle O, \psi\rangle \quad\text{and}\quad \langle O^2, U\psi\rangle = \langle O^2, \psi\rangle.$$
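A finite set with counting measure is a $\sigma$-finite measure space, so the Markov chain statement can be tried out on small stochastic matrices. The sketch below uses exact rational arithmetic; the matrices `U_mixing` and `U_block`, the observables, and the function names are all illustrative assumptions. Because every probability distribution on a finite set is a convex combination of basis distributions, it checks the two moment conditions only on those.

```python
from fractions import Fraction as F

def apply(U, psi):
    n = len(U)
    return [sum(U[i][j] * psi[j] for j in range(n)) for i in range(n)]

def commutes(O, U):
    """[O, U] = 0, using [O, U]_{ij} = (O_i - O_j) U_{ij} for diagonal O."""
    n = len(U)
    return all((O[i] - O[j]) * U[i][j] == 0
               for i in range(n) for j in range(n))

def preserves_moments(O, U):
    """<O^k, U e_j> = <O^k, e_j> for k = 1, 2 and every basis distribution e_j."""
    n = len(U)
    for j in range(n):
        e = [1 if i == j else 0 for i in range(n)]
        Ue = apply(U, e)
        for k in (1, 2):
            if sum(O[i] ** k * Ue[i] for i in range(n)) != O[j] ** k:
                return False
    return True

half = F(1, 2)

# Sends state 1 to states 0 and 2 with equal probability; columns sum to one.
U_mixing = [[1, half, 0],
            [0, 0, 0],
            [0, half, 1]]
O_spread = [0, 1, 2]   # the mean is preserved by U_mixing, the second moment is not

# Mixes states 0 and 1 only; O_flat is constant on that pair.
U_block = [[half, half, 0],
           [half, half, 0],
           [0, 0, 1]]
O_flat = [1, 1, 2]

print(commutes(O_spread, U_mixing), preserves_moments(O_spread, U_mixing))  # False False
print(commutes(O_flat, U_block), preserves_moments(O_flat, U_block))        # True True
```

In both cases the two tests agree, as the theorem predicts. The first pair also shows why the second moment is needed: `U_mixing` preserves the mean of `O_spread` but not the mean of its square.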



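Proof. First, suppose $[O,U] = 0$. Note

$$\langle O, \phi\rangle = \langle 1, O\phi\rangle$$

and since $U$ is stochastic, also

$$\langle 1, U\phi\rangle = \langle 1, \phi\rangle$$

for all $\phi \in L^1(X)$. Thus, for any probability distribution $\psi$ on $X$ and any $n \ge 0$ we have

$$\langle O^n, U\psi\rangle = \langle 1, O^n U\psi\rangle = \langle 1, U O^n \psi\rangle = \langle 1, O^n\psi\rangle = \langle O^n, \psi\rangle.$$

Taking $n = 1, 2$ we get the desired result.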

To prove the converse, we use three lemmas. In all these $X$ is a $\sigma$-finite measure space,
$U\colon L^1(X) \to L^1(X)$ is a stochastic operator, and $O$ is an observable. We freely switch between
thinking of $O$ as a function in $L^\infty(X)$ and the operator on $L^1(X)$ given by multiplying by that
function.

Lemma 1. Suppose that for any compact interval $I \subseteq \mathbb{R}$ the operator $U$ commutes with $\chi_I(O)$,
meaning the operator on $L^1(X)$ given by multiplying by the characteristic function of

$$O^{-1}(I) = \{x \in X : O(x) \in I\}.$$

Then $U$ commutes with $O$.

Proof. The range of the function $O$ is contained in the interior of some interval $[-M, M]$. The step
functions

$$f_n = \sum_{i=-n}^{n-1} \frac{iM}{n}\, \chi_{\left[\frac{iM}{n}, \frac{(i+1)M}{n}\right)}$$

are uniformly bounded and converge pointwise to the identity function on the range of $O$, so by the
dominated convergence theorem $f_n(O)\psi \to O\psi$ in the $L^1$ norm for all $\psi \in L^1(X)$. Furthermore,
though we have not written it as such, $f_n$ is a linear combination of characteristic functions of compact
intervals, since we include a single point as a degenerate special case of a compact interval. Thus,
by hypothesis, $U$ commutes with $f_n(O)$. Since $U$ is bounded, it follows that for every $\psi \in L^1(X)$, we have

$$OU\psi = \lim_{n\to\infty} f_n(O)U\psi = \lim_{n\to\infty} U f_n(O)\psi = UO\psi. \qquad \square$$

Lemma 2. Suppose that for every compact interval $I$ and every $\psi \in L^1(X)$,

$$\mathrm{supp}(\psi) \subseteq O^{-1}(I) \;\Rightarrow\; \mathrm{supp}(U\psi) \subseteq O^{-1}(I).$$

Then $U$ commutes with every operator $\chi_I(O)$.

Proof. The set of $L^1$ functions supported in $O^{-1}(I)$ is the range of the operator $\chi_I(O)$, so the
hypothesis says that $U$ maps the range of this operator to itself. Given any $\psi \in L^1(X)$ and writing
$p = \chi_I(O)$, we have

$$\psi = p\psi + (1 - p)\psi$$

so

$$pU\psi = pUp\psi + pU(1 - p)\psi.$$

Since $U$ preserves the range of $p$ and $p$ is the identity on this range, we have $pUp\psi = Up\psi$. Since
the range of the function $O$ is contained in some interval $[-M, M]$, we can write $1 - p$ as a linear
combination of operators $\chi_J(O)$ for other compact intervals $J$, again using the fact that a point is
a degenerate case of a compact interval. Thus $U$ also preserves the range of $1 - p$. Since the range
of $1 - p$ is the kernel of $p$, $pU(1 - p)\psi = 0$. We thus have

$$pU\psi = Up\psi. \qquad \square$$



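Lemma 3. Suppose that $\langle O, U\psi\rangle = \langle O, \psi\rangle$ and $\langle O^2, U\psi\rangle = \langle O^2, \psi\rangle$ for all $\psi \in L^1(X)$. Then for
every compact interval $I$ and every $\psi \in L^1(X)$,

$$\mathrm{supp}(\psi) \subseteq O^{-1}(I) \;\Rightarrow\; \mathrm{supp}(U\psi) \subseteq O^{-1}(I).$$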

Proof. The range of $O$ is contained in the interior of some interval $[-M, M]$. Thus we shall only
prove the lemma for $I$ contained in $(-M, M)$, since otherwise we can replace $I$ by a smaller compact
interval with this property without changing $O^{-1}(I)$.

Suppose $\psi$ is supported in $O^{-1}(I)$. We wish to show the same for $U\psi$. It suffices to show that
$U\psi$ is supported in $O^{-1}(J)$ where $J$ is any compact interval with $I \subseteq \mathrm{int}(J) \subseteq (-M, M)$. Moreover,
we may assume that $\psi$ is a probability distribution, since any $L^1$ function supported in $O^{-1}(I)$ is a
linear combination of two probability distributions supported in this set.

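To show $U\psi$ is supported in $O^{-1}(J)$, we write

$$\psi = \sum_{i=-n}^{n-1} \psi_i$$

where

$$\psi_i = \chi_{\left[\frac{iM}{n}, \frac{(i+1)M}{n}\right)}(O)\, \psi.$$

We thus have $U\psi = \sum_{i=-n}^{n-1} U\psi_i$ and, since the $\psi_i$ are nonnegative with disjoint supports,

$$\|\psi\|_1 = \sum_{i=-n}^{n-1} \|\psi_i\|_1.$$

We shall show that

$$\|(1 - \chi_J(O))U\psi_i\|_1 \le \frac{c}{n^2}\, \|\psi_i\|_1 \qquad (1)$$

for some constant $c$. It follows that

$$\|(1 - \chi_J(O))U\psi\|_1 \le \sum_{i=-n}^{n-1} \|(1 - \chi_J(O))U\psi_i\|_1 \le \frac{c}{n^2} \sum_{i=-n}^{n-1} \|\psi_i\|_1 = \frac{c}{n^2}\, \|\psi\|_1$$

for all $n$, and thus

$$\|(1 - \chi_J(O))U\psi\|_1 = 0$$

so $U\psi$ is supported in $O^{-1}(J)$.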

To prove (1) we shall use Chebyshev's inequality, which says that the probability of a random
variable taking a value at least $k$ standard deviations away from its mean is less than or equal to
$1/k^2$. However, we first need to convert probability functions on $X$ into probability measures on the
real line.

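For any function $\psi \in L^1(X)$ we can push forward the signed measure $\psi\,dx$ on $X$ via $O\colon X \to \mathbb{R}$
to obtain a signed measure on $\mathbb{R}$, which we call $\widetilde{\psi}$. Concretely, $\widetilde{\psi}$ is characterized by the equation

$$\int_X f(O(x))\,\psi(x)\,dx = \int_{\mathbb{R}} f\, d\widetilde{\psi}$$

which holds for any bounded measurable function $f\colon \mathbb{R} \to \mathbb{R}$. Since the integral at left only depends
on the value of $f$ in the interval $[-M, M]$, the same is true for the integral at right, so with no harm
we can restrict $f$ and the integral to $[-M, M]$. Clearly

$$\widetilde{\phi + \chi} = \widetilde{\phi} + \widetilde{\chi}$$

and

$$\widetilde{\alpha\phi} = \alpha\widetilde{\phi}$$

for all $\alpha \in \mathbb{R}$ and $\phi, \chi \in L^1(X)$. If $\phi$ is nonnegative then $\widetilde{\phi}$ is a nonnegative measure, and if $\phi$ is a
probability distribution then $\widetilde{\phi}$ is a probability measure.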
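On a finite set the pushforward is just a regrouping of probability mass by the value of $O$, which makes both it and Chebyshev's inequality easy to illustrate. The following is a small sketch with exact rational arithmetic; the distribution `psi`, the observable `O`, the distance `d`, and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction as F

def pushforward(psi, O):
    """Pushforward of the measure psi along O, as a dict value -> total mass."""
    measure = {}
    for mass, value in zip(psi, O):
        measure[value] = measure.get(value, 0) + mass
    return measure

def mean_and_variance(measure):
    mean = sum(v * m for v, m in measure.items())
    variance = sum((v - mean) ** 2 * m for v, m in measure.items())
    return mean, variance

def tail_mass(measure, mean, d):
    """Total mass of the points at distance at least d from the mean."""
    return sum(m for v, m in measure.items() if abs(v - mean) >= d)

psi = [F(1, 2), F(1, 4), F(1, 8), F(1, 8)]   # a probability distribution on 4 states
O = [0, 0, 1, 3]                              # states 0 and 1 share the value 0

measure = pushforward(psi, O)
mean, variance = mean_and_variance(measure)

d = 2
# Chebyshev's inequality in the form P(|value - mean| >= d) <= variance / d^2.
bound = variance / d ** 2
tail = tail_mass(measure, mean, d)

print(mean, variance)   # 1/2 1
print(tail, bound)      # 1/8 1/4
```

The mean and variance computed from the pushforward agree with $\langle O, \psi\rangle$ and $\langle O^2, \psi\rangle - \langle O, \psi\rangle^2$, which is what lets hypotheses about those two expected values control the tail of the measure on the real line.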

Since we are assuming the function $\psi$ is supported in $O^{-1}(I)$, the measure $\widetilde{\psi}$ is supported in $I$.
Since

$$\psi_i = \chi_{\left[\frac{iM}{n}, \frac{(i+1)M}{n}\right)}(O)\, \psi$$

we have

$$\widetilde{\psi_i} = \chi_{\left[\frac{iM}{n}, \frac{(i+1)M}{n}\right)}\, \widetilde{\psi}.$$

In what follows we assume $\psi_i$ is nonzero, since otherwise (1) is trivial. This implies that we can
rescale $\widetilde{\psi_i}$ to obtain a probability measure, namely

$$\frac{1}{\|\psi_i\|_1}\, \widetilde{\psi_i}.$$

Since this probability measure is supported in the interval $I \cap [iM/n, (i+1)M/n)$, clearly its mean
lies in $I$, and its standard deviation is $\le M/n$.
By our hypotheses on $U$,

$$\frac{1}{\|\psi_i\|_1}\, \widetilde{U\psi_i}$$

is another probability measure on the real line, with the same mean and standard deviation as the one
above. Thus, by Chebyshev's inequality, the integral of this probability measure over the complement
of $J$ is less than or equal to $((M/n)/d)^2$, where $d$ is the distance of $I$ from the complement of $J$. In
other words,

$$\frac{1}{\|\psi_i\|_1} \int_{\mathbb{R} - J} d\,\widetilde{U\psi_i} \le \left(\frac{M}{nd}\right)^2$$

or writing $(M/d)^2 = c$,

$$\int_{\mathbb{R}} (1 - \chi_J)\, d\,\widetilde{U\psi_i} \le \frac{c}{n^2}\, \|\psi_i\|_1$$

or equivalently

$$\int_X \big(1 - \chi_J(O)(x)\big)\, (U\psi_i)(x)\, dx \le \frac{c}{n^2}\, \|\psi_i\|_1.$$

Since the integrand is nonnegative, this implies

$$\|(1 - \chi_J(O))U\psi_i\|_1 \le \frac{c}{n^2}\, \|\psi_i\|_1$$

which is (1), as desired. $\square$

Combining these three lemmas, the converse follows. $\square$

Acknowledgements 

BF's research was supported by an internship at the Centre for Quantum Technologies. 



References 

[1] J. Baez, Network theory, available at http://math.ucr.edu/home/baez/networks/.

[2] N. Byers, E. Noether's discovery of the deep connection between symmetries and conservation
laws, available as arXiv:physics/9807044.

[3] Y. Kosmann-Schwarzbach, The Noether Theorems: Invariance and Conservation Laws in the
Twentieth Century, trans. Bertram E. Schwarzbach, Springer, Berlin, 2011.

[4] D. Stroock, An Introduction to Markov Processes, Springer, Berlin, 2005. 

[5] K. Yosida, Functional Analysis, Springer, Berlin, 1996. 



